


Creators/Authors contains: "Zhang, Ziqi"


  1. SUMMARY

    Seismic interrogation of the upper mantle, from the base of the crust to the top of the mantle transition zone, has revealed discontinuities that vary in location, depth, lateral extent, and amplitude, and that lack a unified explanation for their origin. Improved constraints on the detectability and properties of mantle discontinuities can be obtained with the P-to-S receiver function (Ps-RF), in which energy scatters from P to S as seismic waves propagate across the discontinuities of interest. However, because of interference from crustal multiples, uppermost mantle discontinuities are more commonly imaged with the lower-resolution S-to-P receiver function (Sp-RF). In this study, a new method called CRISP-RF (Clean Receiver-function Imaging using SParse Radon Filters) is proposed, which incorporates ideas from compressive sensing and model-based image reconstruction. The central idea is to apply a sparse Radon transform that decomposes the Ps-RF into its underlying wavefield contributions (direct conversions, multiples, and noise) based on phase moveout and coherence. A masking filter is then designed and applied to create a multiple-free and denoised Ps-RF. We demonstrate, using synthetic experiments, that our implementation of the Radon transform with sparsity-promoting regularization outperforms conventional least-squares methods and can effectively isolate direct Ps conversions. We further apply the CRISP-RF workflow to real data, including single-station data on cratons, common-conversion-point stacks at continental margins, and seismic data from ocean islands. The application of CRISP-RF to global data sets will advance our understanding of the enigmatic origins of upper mantle discontinuities such as the ubiquitous mid-lithospheric discontinuity and the elusive X-discontinuity.
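The contrast between a sparsity-promoting and a least-squares Radon transform can be illustrated with a toy example. The sketch below is not the CRISP-RF implementation: it assumes a linear moveout (t = tau + p·x) with circular integer shifts, whereas receiver-function phases follow a different moveout, and it uses plain ISTA (iterative soft thresholding) as one common L1 solver. All function names and parameters (`make_shifts`, `sparse_radon_ista`, `lam`) are hypothetical.

```python
import numpy as np

def make_shifts(offsets, slownesses, dt):
    """Integer delay (in samples) of trace j for slowness k: p_k * x_j / dt."""
    return np.rint(np.outer(offsets, slownesses) / dt).astype(int)

def forward(model, shifts):
    """Map a (tau, p) model to (t, x) data by shifting and summing (toy linear Radon)."""
    nx, n_p = shifts.shape
    data = np.zeros((model.shape[0], nx))
    for j in range(nx):
        for k in range(n_p):
            data[:, j] += np.roll(model[:, k], shifts[j, k])
    return data

def adjoint(data, shifts):
    """Exact adjoint of `forward`: undo each shift and stack over traces."""
    nx, n_p = shifts.shape
    model = np.zeros((data.shape[0], n_p))
    for j in range(nx):
        for k in range(n_p):
            model[:, k] += np.roll(data[:, j], -shifts[j, k])
    return model

def objective(model, data, shifts, lam):
    """Least-squares misfit plus an L1 penalty on the Radon model."""
    residual = forward(model, shifts) - data
    return 0.5 * np.sum(residual ** 2) + lam * np.sum(np.abs(model))

def sparse_radon_ista(data, shifts, lam, n_iter=200):
    """Minimise `objective` with ISTA; lam = 0 reduces to plain least squares."""
    nx, n_p = shifts.shape
    # Each model sample feeds nx data samples and each data sample sums n_p
    # model samples, so nx * n_p bounds the squared operator norm.
    step = 1.0 / (nx * n_p)
    model = np.zeros((data.shape[0], n_p))
    for _ in range(n_iter):
        gradient = adjoint(forward(model, shifts) - data, shifts)
        update = model - step * gradient
        # Soft thresholding is what promotes sparsity in the (tau, p) domain.
        model = np.sign(update) * np.maximum(np.abs(update) - step * lam, 0.0)
    return model
```

In a workflow of the kind the abstract describes, a mask would then be applied to the (tau, p) model before mapping it back to the data domain with `forward`; that masking step is not shown here.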

     
  2. Abstract

    Large portions of the Earth are covered by oceans, sediments, and glaciers. High‐resolution body wave imaging in such environments often suffers from severe reverberations, that is, repeating echoes of the incoming scattered wavefield trapped in the reverberant layer, which make interpretation of lithospheric layering difficult. In this study, we propose a systematic data‐driven approach, using autocorrelation and homomorphic analysis, to solve the twin problems of detecting and eliminating reverberations without a priori knowledge of the elastic structure of the reverberant layers. We demonstrate, using synthetic experiments and data examples, that our approach can effectively identify the signature of reverberations even when the recording seismic array is deployed in complex settings, for example, using data from (a) a land station on the Songliao basin, (b) an ocean bottom station in the fore‐arc setting of the Alaska Amphibious Community Seismic Experiment, and (c) a station deployed on ice‐sediment strata in the glaciers of Antarctica. The reverberations are eliminated with a frequency domain filter whose parameters are automatically tuned using seismic data alone. On glaciers, where the reverberating sediment layer is sandwiched between the lithosphere and an overlying ice layer, homomorphic analysis is preferable for detecting the signature of reverberation. We expect that our technique will see wide application for high‐resolution body wave imaging across a wide variety of conditions.
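As a rough illustration of how autocorrelation can expose a reverberation, the sketch below assumes an idealised single-layer model in which each echo returns after a fixed lag with amplitude scaled by -r0. Under that assumption the normalised autocorrelation has its deepest trough at the reverberation lag, with depth close to r0. The function names, the sign convention, and the model itself are assumptions for illustration, not the procedure used in the study.

```python
import numpy as np

def reverberate(source, r0, lag):
    """Idealised reverberation train (assumed model): x[n] = s[n] - r0 * x[n - lag]."""
    out = np.array(source, dtype=float)
    for n in range(lag, len(out)):
        out[n] -= r0 * out[n - lag]
    return out

def detect_reverberation(trace, min_lag=1):
    """Estimate (lag, r0) from the most negative trough of the autocorrelation."""
    trace = np.asarray(trace, dtype=float)
    full = np.correlate(trace, trace, mode="full")
    acf = full[len(trace) - 1:]      # keep non-negative lags only
    acf = acf / acf[0]               # normalise by the zero-lag energy
    lag = min_lag + int(np.argmin(acf[min_lag:]))
    return lag, -acf[lag]

# Usage: a unit spike reverberating every 20 samples with strength 0.5.
spike = np.zeros(400)
spike[0] = 1.0
lag, r0 = detect_reverberation(reverberate(spike, r0=0.5, lag=20))
```

For real data the source wavelet is not a spike and noise is present, so the trough is broadened and the estimates are only approximate.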

     
  3. Abstract

    Integrating scRNA-seq and scATAC-seq data obtained from different batches is a challenging task. Existing methods tend to use a pre-defined gene activity matrix to convert the scATAC-seq data into scRNA-seq data. The pre-defined gene activity matrix is often of low quality and does not reflect the dataset-specific relationship between the two data modalities. We propose scDART, a deep learning framework that integrates scRNA-seq and scATAC-seq data and learns cross-modality relationships simultaneously. Specifically, the design of scDART allows it to preserve cell trajectories in continuous cell populations and to be applied to trajectory inference on integrated data.

     
  4. Abstract

    Single cell data integration methods aim to integrate cells across data batches and modalities. Data integration tasks can be categorized into horizontal, vertical, diagonal, and mosaic integration, of which mosaic integration is the most general and challenging case, with few methods developed. We propose scMoMaT, a method that integrates single cell multi-omics data under the mosaic integration scenario using matrix tri-factorization. During integration, scMoMaT also uncovers cluster-specific bio-markers across modalities. These multi-modal bio-markers are used to interpret the clusters and annotate them with cell types. Moreover, scMoMaT can integrate cell batches with unequal cell type compositions. Applying scMoMaT to multiple real and simulated datasets demonstrated these features and showed that scMoMaT has superior performance compared to existing methods. Specifically, we show that the integrated cell embedding combined with the learned bio-markers leads to cell type annotations of higher quality or resolution than the original annotations.

     
  5. Yann, Ponty (Ed.)
    Abstract Motivation

    The study of the evolutionary history of biological networks enables deep functional understanding of various bio-molecular processes. Network growth models, such as the Duplication–Mutation with Complementarity (DMC) model, provide a principled approach to characterizing the evolution of protein–protein interactions (PPIs) based on duplication and divergence. Current methods for model-based ancestral network reconstruction primarily use greedy heuristics and yield sub-optimal solutions.

    Results

    We present a new Integer Linear Programming (ILP) solution for maximum likelihood reconstruction of ancestral PPI networks using the DMC model. We prove the correctness of our solution, which is designed to find the optimal solution. It can also use efficient heuristics from general-purpose ILP solvers to obtain multiple optimal and near-optimal solutions that may be useful in many applications. Experiments on synthetic data show that our ILP obtains solutions with higher likelihood than those from previous methods, and is robust to noise and model mismatch. We evaluate our algorithm on two real PPI networks, with proteins from the families of bZIP transcription factors and the Commander complex. On both networks, solutions from our ILP have higher likelihood and are in better agreement with independent biological evidence from other studies.

    Availability and implementation

    A Python implementation is available at https://bitbucket.org/cdal/network-reconstruction.

    Supplementary information

    Supplementary data are available at Bioinformatics online.
  6. Adversarial training is an effective defense method to protect classification models against adversarial attacks. However, one limitation of this approach is that it can require orders of magnitude more training time because of the high cost of generating strong adversarial examples during training. In this paper, we first show that there is high transferability between models from neighboring epochs in the same training process, i.e., adversarial examples from one epoch continue to be adversarial in subsequent epochs. Leveraging this property, we propose a novel method, Adversarial Training with Transferable Adversarial Examples (ATTA), that can enhance the robustness of trained models and greatly improve training efficiency by accumulating adversarial perturbations through epochs. Compared to state-of-the-art adversarial training methods, ATTA enhances adversarial accuracy by up to 7.2% on CIFAR10 and requires 12 to 14× less training time on the MNIST and CIFAR10 datasets with comparable model robustness.
  7. Abstract

    While the receiver function technique has been successfully applied to high‐resolution imaging of sharp discontinuities within and across the lithosphere, it suffers from severe limitations when applied to seafloor seismic recordings. This is because the water and sediment layers can strongly influence the receiver function traces, making detection and interpretation of crust and mantle layering difficult. This effect is often referred to as the singing phenomenon in marine environments. We demonstrate, using analytical and synthetic modeling, that this singing effect can be reversed using a selective dereverberation filter tuned to match the elastic properties of each layer. We apply the dereverberation filter to high‐quality earthquake records collected from the NoMelt seismic array deployed on normal, mature Pacific seafloor. An appropriate filter designed using the elastic properties of the underlying sediments, obtained from prior studies, greatly improves the detection of Ps conversions from the Moho (∼8.6 km) and from a sharp discontinuity (<∼5 km) across the lithosphere‐asthenosphere transition (∼72 km). Sensitivity tests show that the dereverberation filter is mostly sensitive to the two‐way travel time of the shear wave in the sediment and is robust to seismic noise and small errors in the sediment properties. Our analysis suggests that selectively filtering out the sediment reverberations from ocean seismic data could make inferences on subsurface structure more robust. We expect that this study will enable high‐resolution receiver function imaging of the oceanic plate across the growing ocean bottom seismic arrays being deployed in the global oceans.
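A minimal sketch of a single-layer dereverberation filter follows, assuming the reverberation obeys x[n] = s[n] - r0·x[n - lag], where the lag is the two-way shear travel time in the sediment and r0 is a reflection strength. Under that assumption the inverse filter is y[n] = x[n] + r0·x[n - lag], which in the frequency domain corresponds to multiplying the spectrum by 1 + r0·exp(-iωΔt). The names `add_reverberation`, `dereverberate`, `r0`, and `two_way_time` are hypothetical, and the filter used in the study may differ in detail.

```python
import numpy as np

def add_reverberation(clean, r0, lag):
    """Assumed reverberation model: x[n] = s[n] - r0 * x[n - lag]."""
    out = np.array(clean, dtype=float)
    for n in range(lag, len(out)):
        out[n] -= r0 * out[n - lag]
    return out

def dereverberate(trace, r0, dt, two_way_time):
    """Undo the model above with y[n] = x[n] + r0 * x[n - lag]."""
    trace = np.asarray(trace, dtype=float)
    lag = int(round(two_way_time / dt))  # two-way travel time in samples
    if lag < 1:
        raise ValueError("two_way_time must be at least one sample long")
    out = trace.copy()
    out[lag:] += r0 * trace[:-lag]       # add back the delayed, scaled copy
    return out

# Usage: two conversions contaminated by a 2 s reverberation at dt = 0.1 s.
clean = np.zeros(300)
clean[10], clean[85] = 1.0, 0.4
ringing = add_reverberation(clean, r0=0.6, lag=20)
restored = dereverberate(ringing, r0=0.6, dt=0.1, two_way_time=2.0)
```

Because the filter depends on the data only through the lag and r0, an error in the travel time misplaces the cancelling copy, which is consistent with the sensitivity to two-way travel time noted in the abstract.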

     
  8. Abstract Objective

    Electronic medical records (EMRs) can support medical research and discovery, but privacy risks limit the sharing of such data on a wide scale. Various approaches have been developed to mitigate risk, including record simulation via generative adversarial networks (GANs). While GANs show promise in certain application domains, they lack a principled approach for EMR data, which leads to subpar simulation. In this article, we improve EMR simulation through a novel pipeline that (1) enhances the learning model, (2) incorporates evaluation criteria for data utility that inform learning, and (3) refines the training process.

    Materials and Methods

    We propose a new electronic health record generator using a GAN with a Wasserstein divergence and layer normalization techniques. We designed 2 utility measures to characterize similarity in the structural properties of real and simulated EMRs in the original and latent space, respectively. We applied a filtering strategy to enhance GAN training for low-prevalence clinical concepts. We evaluated the new and existing GANs with utility and privacy measures (membership and disclosure attacks) using billing codes from over 1 million EMRs at Vanderbilt University Medical Center.

    Results

    The proposed model outperformed the state-of-the-art approaches with significant improvement in retaining the nature of real records, including prediction performance and structural properties, without sacrificing privacy. Additionally, the filtering strategy achieved higher utility when the EMR training dataset was small.

    Conclusions

    These findings illustrate that EMR simulation through GANs can be substantially improved through more appropriate training, modeling, and evaluation criteria.

     